WIP: Test CI#14
Conversation
- Introduce the workspace store abstraction (SQLAlchemy + REST), name validation, and registry/URI utilities with CLI flags and env vars. - Expose workspace CRUD endpoints in the Flask server with proper guards/feature flag wiring plus MlflowClient surface methods. - Add the Workspace entity, tests for the stores/registry/client, and docs entries so the new multi-tenant provider can be exercised end-to-end. Signed-off-by: mprahl <mprahl@users.noreply.github.com>
|
Important Review skippedAuto reviews are disabled on base/target branches other than the default branch. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the WalkthroughThis pull request introduces comprehensive multi-tenancy workspace support to MLflow, enabling logical isolation of experiments, models, and permissions within a single server. New workspace store implementations, CRUD APIs, database schema, server routing, permission management, and workspace context management are added alongside extensive test coverage and documentation. Changes
Sequence Diagram(s)sequenceDiagram
participant Client as MLflow Client
participant Tracking as Tracking Service
participant WorkspaceCtx as Workspace Context
participant Store as Tracking Store
participant Auth as Auth Backend
Client->>WorkspaceCtx: set_workspace("team-a")
WorkspaceCtx->>WorkspaceCtx: Store in ContextVar
Client->>Tracking: create_experiment("exp1")
Tracking->>WorkspaceCtx: get_current_workspace()
WorkspaceCtx-->>Tracking: "team-a"
Tracking->>Store: create_experiment("exp1", workspace="team-a")
Store->>Auth: check_permission(user, "team-a", "EDIT")
Auth-->>Store: Permission granted
Store->>Store: Insert experiment with workspace="team-a"
Store-->>Tracking: Experiment created
Tracking-->>Client: Experiment("exp1", workspace="team-a")
Client->>WorkspaceCtx: set_workspace("team-b")
Client->>Tracking: get_experiment(exp1_id)
Tracking->>Store: get_experiment(exp1_id, workspace="team-b")
Store->>Store: Query where workspace="team-b" AND id=exp1_id
Store-->>Tracking: Not found (workspace isolation)
Tracking-->>Client: MlflowException(RESOURCE_DOES_NOT_EXIST)
sequenceDiagram
participant Server as MLflow Server
participant Request as HTTP Request
participant Handler as Workspace Handler
participant AuthZ as AuthZ Service
participant Handler2 as Request Handler
Request->>Server: POST /api/2.0/mlflow/runs/search<br/>X-MLFLOW-WORKSPACE: team-a
Server->>Handler: workspace_before_request_handler()
Handler->>Handler: Extract workspace from header
Handler->>Handler: Store in g.mlflow_workspace
Handler->>AuthZ: Validate workspace access
AuthZ-->>Handler: Authorized
Handler-->>Server: Context set
Server->>Handler2: Process request (search runs)
Handler2->>Handler2: Filter runs by workspace="team-a"
Handler2-->>Server: Return filtered runs
Server->>Handler: workspace_teardown_request_handler()
Handler->>Handler: Clear g.mlflow_workspace
Handler-->>Server: Cleanup complete
Estimated code review effort🎯 5 (Critical) | ⏱️ ~150 minutes Specific areas requiring extra attention:
Poem
Pre-merge checks and finishing touches❌ Failed checks (2 warnings, 1 inconclusive)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 8
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (4)
mlflow/server/handlers.py (1)
535-545: Workspace enablement incorrectly ignoresMLFLOW_ARTIFACT_ROOTenv varThe guard only checks
default_artifact_root, but the error message promises that either--default-artifact-rootorMLFLOW_ARTIFACT_ROOTis sufficient. If a user enables workspaces via--enable-workspacesand relies solely onMLFLOW_ARTIFACT_ROOT, this will incorrectly raiseINVALID_PARAMETER_VALUE.You can fix this by incorporating the env var into the check:
def initialize_backend_stores( backend_store_uri: str | None = None, registry_store_uri: str | None = None, default_artifact_root: str | None = None, workspace_store_uri: str | None = None, ) -> None: - if MLFLOW_ENABLE_WORKSPACES.get() and not default_artifact_root: - raise MlflowException.invalid_parameter_value( - "--enable-workspaces requires --default-artifact-root or the " - "MLFLOW_ARTIFACT_ROOT environment variable to be set." - ) + if MLFLOW_ENABLE_WORKSPACES.get(): + from mlflow.server import ARTIFACT_ROOT_ENV_VAR + artifact_root = default_artifact_root or os.environ.get(ARTIFACT_ROOT_ENV_VAR) + if not artifact_root: + raise MlflowException.invalid_parameter_value( + "--enable-workspaces requires --default-artifact-root or the " + "MLFLOW_ARTIFACT_ROOT environment variable to be set." + )This keeps behavior aligned with the documented contract while still enforcing a non‑null artifact root when workspaces are enabled.
mlflow/server/otel_api.py (1)
59-125: Fix Content-Type validation and align response payload with Content-Type headerThe code has two confirmed issues:
Content-Type check too strict (line 96): The exact match
if content_type != "application/x-protobuf"rejects valid OTLP clients that send parameters (e.g.,application/x-protobuf; charset=utf-8) or case variations. Following the pattern inmlflow/server/validation.py, split on;and normalize case.Response Content-Type mismatch (line 103): The code sets
response.headers["Content-Type"] = "application/x-protobuf"but returns a PydanticOTelExportTraceServiceResponse()instance. FastAPI automatically serializes this to JSON, not protobuf. The actual response body will be JSON (e.g.,{"partialSuccess": null}), not protobuf-encoded data.Recommend either:
- Parse Content-Type header correctly and return actual protobuf-encoded
ExportTraceServiceResponsefromopentelemetry.proto.collector.trace.v1.trace_service_pb2, OR- Change response Content-Type to
application/jsonif returning JSON is intentionalmlflow/store/tracking/sqlalchemy_store.py (1)
648-710:_get_experimentdefault resolution per workspace is correct but may surprise legacy clientsThe new logic:
- Uses the active workspace when
workspaceis not provided.- For
experiment_id is Noneand workspaces enabled, resolves (and lazily bootstraps) the default experiment by name within the current workspace, raising a clear error if creation fails.- For explicit
experiment_id, always filters by bothexperiment_idandworkspace.This is the right behavior for multi-tenancy. As with the earlier comment, the workspace filter assumes the
workspacecolumn is populated for all experiments, so make sure migrations keep that invariant; otherwise old experiments with NULL workspace will be invisible here.mlflow/server/auth/__init__.py (1)
820-835:filter_experiment_idsshould use_has_experiment_read_accessfor workspace-aware filteringThis function currently bypasses workspace permission checks that are already implemented in
_has_experiment_read_access. Other search filters likefilter_search_experimentsandfilter_search_logged_modelsuse this function to respect workspace isolation, butfilter_experiment_idsonly checks explicit experiment permissions andauth_config.default_permission.When workspaces are enabled, this creates inconsistent filtering behavior: datasets and other resources filtered through
filter_experiment_idswill not respect workspace-level access controls, while search operations will.Refactor to call
_has_experiment_read_accessfor each experiment ID (lines 810–835), passing theauthobject and workspace permission getter, to ensure consistent workspace isolation.
🧹 Nitpick comments (53)
tests/server/test_workspace_handlers.py (5)
41-55: Consider consistent type hints in test doubles.The
DummyTrackingStoreuses type hints forcreated_namesand_experimentsbut not for method signatures. For better maintainability, consider adding type hints to all methods or removing them entirely from the test double.
82-108: Reduce duplication by extracting common test fixtures.The
DummyWorkspaceStoreandDummyTrackingStoreimplementations are repeated across multiple tests with only minor variations. Consider extracting these into reusable fixtures with configurable behavior to improve maintainability.For example, create a configurable fixture that accepts parameters for different behaviors:
@pytest.fixture def dummy_workspace_store(): class DummyWorkspaceStore: def __init__(self): self.deleted = False self.created = [] def create_workspace(self, workspace): self.created.append(workspace.name) return workspace def delete_workspace(self, workspace_name): self.deleted = True return DummyWorkspaceStore()Also applies to: 154-186, 208-239
172-174: Fragile filter string parsing in test doubles.The substring check
"name !=" in filter_stringis brittle and implicit. Consider explicitly checking for the expected filter string value or documenting the assumption to make test expectations clearer.For example:
- if filter_string and "name !=" in filter_string: + # Filter out default experiment when searching for non-default experiments + if filter_string == f"name != '{Experiment.DEFAULT_EXPERIMENT_NAME}'": return []Also applies to: 226-228
31-31: Consider extracting environment setup to a fixture.The
MLFLOW_ENABLE_WORKSPACESenvironment variable is set identically in all tests. Consider extracting this to a shared fixture to reduce duplication:@pytest.fixture def enable_workspaces(monkeypatch): monkeypatch.setenv(MLFLOW_ENABLE_WORKSPACES.name, "true")Then use it in tests:
def test_create_workspace_handler_creates_default_experiment(monkeypatch, enable_workspaces):Also applies to: 80-80, 128-128, 152-152, 206-206, 256-256
145-148: Consider verifying workspace context cleanup after teardown.The test calls
workspace_teardown_request_handlerbut doesn't verify that the workspace context is properly cleared. Consider adding an assertion after teardown to ensure cleanup occurred:handlers.workspace_teardown_request_handler(None) + assert workspace_context.get_current_workspace() is Nonemlflow/server/auth/config.py (1)
14-14: LGTM! Secure default will be applied.The new boolean field is correctly typed and positioned. The secure default of
False(applied in the reading logic below) follows the deny-by-default principle for access control.Consider adding a docstring to the
AuthConfigclass or an inline comment explaining what this configuration option controls, for example:class AuthConfig(NamedTuple): default_permission: str database_uri: str admin_username: str admin_password: str authorization_function: str + # Controls whether users are automatically granted access to the default workspace grant_default_workspace_access: booltests/tracking/test_rest_tracking.py (3)
1672-1873: Extended non-local model version source coverage looks correctThe extended
test_create_model_version_with_non_local_source:
- Exercises a broad set of valid
mlflow-artifacts:forms (various trailing slashes, host variants, multi-dot paths),- Asserts appropriate 400s for path-traversal-like and encoded traversal attempts across multiple schemes,
- Adds
store_type == "sqlalchemy"coverage for logged-model–basedmodel_idsources vs disallowedfile://sources.The structure and expectations are consistent with the intended security model for model version sources; no issues spotted. If duplication ever becomes a concern, you could factor the repeated
requests.post+ assertion patterns into small helpers, but that’s non-essential.
3429-3451: URL logging workspace behavior is well-covered (minor brittleness on private attribute)The two new tests correctly distinguish:
- Legacy
/#/experiments/{exp_id}URLs when no workspace context is set, even if_supports_workspacesisTrue.- Workspace URLs including
/#/workspaces/{workspace}/experiments/{exp_id}when aWorkspaceContextis active.The only minor brittleness is relying on the private
_supports_workspacesattribute; if that internal flag is renamed or removed, these tests will break even if external behavior is unchanged. That’s acceptable for internal tests but something to keep in mind.
4107-4591: REST workspace CRUD, scoping, and legacy endpoint tests are thorough; consider broadening caught exceptionsThe new suite of REST workspace tests (CRUD, tracking, runs, artifacts, logged models, registry, default workspace, and legacy endpoints) provides very strong coverage of isolation semantics between workspaces and the default context. The patterns of:
- Creating resources under multiple workspaces,
- Asserting cross-workspace visibility failures (via exceptions or empty lists),
- Verifying state in the original workspace after attempted cross-workspace operations,
are all sound and align with expected multi-tenant behavior.
One small robustness tweak:
- In
test_rest_run_operations_are_workspace_scoped,_call_safelyonly catchesMlflowException, but the underlying client calls may raiseRestException(or another exception type) on cross-workspace access, depending on how the client wraps errors. If you intend these calls to be treated as “expected failures” regardless of the precise exception subtype, it would be safer to broaden the catch:- def _call_safely(fn): - try: - fn() - except MlflowException: - pass + def _call_safely(fn): + try: + fn() + except (MlflowException, RestException): + # Cross-workspace access should fail; ignore the exact error shape + passThis keeps the negative tests focused on state invariants without being sensitive to the exact exception class.
docs/api_reference/source/rest-api.rst (1)
62-67: Align error code formatting with rest of docsThe new guidance for workspaces under
artifact_locationlooks good. For consistency with the rest of this page (e.g.,RESOURCE_ALREADY_EXISTS), consider formatting the error code as an inline literal:server uses the workspace's default artifact location and returns ``INVALID_PARAMETER_VALUE`` if provided.mlflow/utils/rest_utils.py (1)
23-24: Workspace header resolution and injection are well-structured; consider minor hardeningThe
_resolve_active_workspacehelper cleanly prefers the tracking workspace context and falls back toMLFLOW_WORKSPACE, trimming empty strings to avoid sending meaningless headers._should_include_workspace_header’s path check to skip/mlflow/workspacesadmin endpoints makes sense, and usingheaders.setdefault(WORKSPACE_HEADER_NAME, workspace)correctly preserves any explicitly provided workspace header (e.g., inextra_headers). One small hardening you might consider is explicitly asserting or coercing non‑string workspace values (shouldget_current_workspace()ever evolve to return something structured) to avoid accidentally sending non‑string objects as header values.Also applies to: 44-45, 62-93, 157-160
tests/store/tracking/test_sqlalchemy_store.py (1)
7791-7807: Consider also asserting deleted model remains non‑retrievable after taggingThe test correctly verifies that tags written after
delete_logged_modelare persisted at theSqlLoggedModelTaglevel. For symmetry withtest_log_logged_model_params_on_deleted_model, you might also assert thatstore.get_logged_model(model.model_id)still raises (i.e., tagging does not resurrect the model). This would tighten the contract without changing behavior.For example:
with pytest.raises(MlflowException, match="not found"): store.get_logged_model(model.model_id)mlflow/entities/evaluation_dataset.py (1)
326-347: Hardento_dferror-code matching for dataset-not-found cases
to_dfcurrently interprets a missing dataset by checking:error_code = getattr(e, "error_code", None) if error_code == ErrorCode.Name(RESOURCE_DOES_NOT_EXIST): ...This assumes
e.error_codeis always the string name"RESOURCE_DOES_NOT_EXIST". If any store surfaces the numeric enum value instead, this branch would be skipped and the exception re-raised.To tolerate both representations, consider:
- except MlflowException as e: - error_code = getattr(e, "error_code", None) - if error_code == ErrorCode.Name(RESOURCE_DOES_NOT_EXIST): + except MlflowException as e: + error_code = getattr(e, "error_code", None) + not_found_codes = { + RESOURCE_DOES_NOT_EXIST, + ErrorCode.Name(RESOURCE_DOES_NOT_EXIST), + } + if error_code in not_found_codes: # Cache the empty records list to avoid repeated store lookups self._records = [] records = [] else: raiseThis keeps behavior unchanged where the string is used, but also handles numeric codes if any backend returns them.
mlflow/store/workspace/abstract_store.py (1)
64-76: Abstract workspace store and name validation are well-structured (minor DRY opportunity)The
AbstractStoreinterface plusresolve_artifact_rootdefault(default_artifact_root, True)give providers a clear extension point without forcing per-workspace customization.WorkspaceNameValidatorcorrectly enforces length, pattern, and reserved-name checks with good error messages.If you want to reduce maintenance overhead, you could have
is_valid()delegate tovalidate()and catchMlflowException, so that future rule changes only need to be updated in one place. Not critical, but it avoids the two methods drifting apart.Also applies to: 79-128
mlflow/server/handlers.py (1)
2935-2963: Workspace‑scoped artifact paths: behavior is soundThe
_workspace_scoped_repo_pathhelper and its use in download/upload/list/delete and multipart endpoints correctly:
- Leave paths untouched when no active workspace.
- Prefix with
workspaces/<workspace>when a workspace is active, isolating proxied artifacts per tenant.- Avoid double‑prefixing when callers already pass
workspaces/….One edge case to consider:
- For
artifact_path == "workspaces"(without a trailing/),normalized.startswith("workspaces/")is false, so you'll end up withworkspaces/<ws>/workspaces. If that path is ever used externally, normalize this case:- if normalized.startswith("workspaces/"): + if normalized == "workspaces" or normalized.startswith("workspaces/"): return normalized or posixpath.join("workspaces", workspace)The multipart upload methods already accept the
artifact_pathparameter, so passingrepo_pathuses the existing interface correctly—no signature extension needed.Also applies to: 2984-2999, 3010-3016, 3033-3038
tests/tracking/test_workspace_registry.py (1)
22-27: Consider using a public cleanup method.Line 27 accesses the private
_engineattribute for cleanup. While this works for testing, it couples the test to implementation details. Consider adding a publicdispose()orclose()method to SqlAlchemyStore if one doesn't already exist.tests/tracking/test_workspace_utils.py (1)
1-38: Workspace URI resolution tests cover key precedence rulesThe autouse fixture correctly cleans both in-memory and env state, and the tests clearly exercise explicit, configured, env, and tracking-default resolution paths. The redundancy of resetting state both before and after is harmless and keeps isolation explicit.
tests/entities/model_registry/test_model_version.py (1)
7-44: ModelVersion workspace tests are thorough; avoid hardcoding the default nameThe extended
_checkhelper and new round-trip test give good coverage ofworkspaceacross construction, dict/proto conversions, and__str__. Intest_string_repr, consider usingDEFAULT_WORKSPACE_NAMEinstead of the literal'default'so the test stays aligned if the default workspace name ever changes:- "status='PENDING_REGISTRATION', status_message='Copying!', tags={}, user_id='user one', " - "version='43', workspace='default'>" + "status='PENDING_REGISTRATION', status_message='Copying!', tags={}, user_id='user one', " + f"version='43', workspace='{DEFAULT_WORKSPACE_NAME}'>"Also applies to: 88-108, 159-185, 188-214
mlflow/tracking/default_experiment/registry.py (1)
63-71: Return type annotation may limit supported Python versions
get_experiment_idis annotated as-> str | None, which relies on PEP 604 union syntax and only parses on Python 3.10+. If this branch still targets Python 3.8/3.9, this will be a syntax error. In that case, prefer the older spelling:-from typing import Optional - -def get_experiment_id() -> str | None: +from typing import Optional + +def get_experiment_id() -> Optional[str]:If you have already dropped support for <3.10, this is fine as-is, but it’s worth confirming against your supported Python matrix.
tests/helpers/db_mocks.py (1)
7-22: Session maker mock matches expected pattern; optional room for richer query mocksThis helper cleanly mimics a managed session maker and produces chainable
query.filter().order_by().first()calls that returnNone, which should be sufficient for the current tests.If future tests need additional query behaviors (e.g.,
filter_by,all, or configurablefirst()values), consider extending_mock_queryaccordingly or parameterizing the mock, but it’s not necessary for this PR.tests/entities/model_registry/test_registered_model.py (1)
5-6: Workspace coverage forRegisteredModelis solid; consider reusing the default constant in repr assertionThe additions to
_check, theas_dictfixtures, and the newtest_registered_model_non_default_workspace_round_tripgive good coverage that theworkspacefield is preserved across construction,dict(rmd),from_dictionary, and__str__.One minor polish point: in
test_string_repr, the expected string currently hard-codesworkspace='default'. To avoid drifting fromDEFAULT_WORKSPACE_NAMEif that constant ever changes, you could build the expected string with the constant, e.g. using an f-string:-from mlflow.utils.workspace_utils import DEFAULT_WORKSPACE_NAME +from mlflow.utils.workspace_utils import DEFAULT_WORKSPACE_NAME @@ - assert ( - str(rmd) == "<RegisteredModel: aliases={}, creation_timestamp=1000, " - "deployment_job_id=None, deployment_job_state=None, description='something about a model'," - " last_updated_timestamp=2002, " - "latest_versions=['1', '2', '3'], name='myname', tags={}, workspace='default'>" - ) + expected = ( + "<RegisteredModel: aliases={}, creation_timestamp=1000, " + "deployment_job_id=None, deployment_job_state=None, " + "description='something about a model', last_updated_timestamp=2002, " + "latest_versions=['1', '2', '3'], name='myname', tags={}, " + f"workspace='{DEFAULT_WORKSPACE_NAME}'>" + ) + assert str(rmd) == expectedNot required for correctness, but it keeps tests aligned with the shared constant.
Also applies to: 19-20, 30-31, 39-50, 93-104, 125-136, 152-163, 174-189, 192-212
docs/docs/self-hosting/workspaces/index.mdx (1)
1-152: Workspace docs are clear; please confirm they exactly match the current implementationThe page gives a good, structured explanation of workspaces, isolation, artifact layout, and migration behavior, and it aligns conceptually with the code changes (e.g., default workspace name,
workspaces/<workspace-name>artifact prefix).Given this is a WIP, I’d recommend a quick pass to verify that:
- The enablement mechanisms and names (
MLFLOW_ENABLE_WORKSPACES,--enable-workspaces,MLFLOW_WORKSPACE) match the actual flags / environment variables in the server code.- The “Workspace Resolution” order matches the combined behavior of
mlflow.set_workspace,MLFLOW_WORKSPACE, and anyget_default_workspace()usage (e.g., viaresolve_entity_workspace_nameandget_default_workspace_optional).This is just to avoid subtle drift between docs and behavior.
mlflow/store/artifact/http_artifact_repo.py (2)
39-59: Workspace parsing hook is consistent but currently unused; consider tightening the APIThe new
_get_workspace_scoped_artifact_uricorrectly detects a/mlflow-artifacts/artifacts/workspaces/<workspace>/…segment without changing existing behavior, and_host_credsstill uses the originalartifact_uri, so there’s no regression.Right now, though, the parsed
workspace_componentisn’t used and_get_artifacts_endpoint_prefix(_workspace_component)ignores its argument. To keep things simpler until you actually need workspace-specific endpoint prefixes, you might either:
- Drop the parameter on
_get_artifacts_endpoint_prefixfor now, or- Thread
workspace_componentthrough and implement a distinct prefix when/if the server API requires it.Not a blocker, but it will make the intent clearer to future readers.
101-105: List and multipart endpoints preserve workspace-scoped paths; small refactor opportunityThe updated
list_artifactsand_construct_mpu_uri_and_pathboth derive the base URL and “root” path by splitting on the artifacts endpoint prefix. That continues to work for both:
- Legacy URIs:
/mlflow-artifacts/artifacts/<root>- Workspace URIs:
/mlflow-artifacts/artifacts/workspaces/<workspace>/<root>since the
tail(and thusroot) simply includes theworkspaces/<workspace>/…path segment.If you want to reduce duplication and future-proof this, consider centralizing the split logic into a helper that returns
(base_url, root_path, workspace_component)and reuse it in bothlist_artifactsand_construct_mpu_uri_and_path. This would also make it easier to change the underlying endpoint layout later in one place.Also applies to: 136-140
tests/store/model_registry/test_rest_store.py (1)
161-175: Workspace not-supported error path test looks solid but is tightly coupled to internalsThe test correctly exercises the failure path when
_workspace_supportisFalseand a non-empty workspace is resolved, including checking the exception message. It does, however, depend on private details (store._workspace_supportandbase_rest_store.rest_utils._resolve_active_workspace), so it may become brittle if internal wiring changes. If the underlying implementation ever exposes a public way to simulate “no workspace support”, consider switching to that to reduce coupling.docs/docs/self-hosting/workspaces/workspace-providers.mdx (1)
1-137: Docs clearly describe workspace providers; consider aligning snippets with actual APIsThe overall explanation of provider architecture, discovery, default SQL provider behavior, artifact
for_workspace()hooks, and Flask context usage is clear and useful. To avoid confusion for users, it’s worth double-checking that:
- The
AbstractStoreandArtifactRepository.for_workspace()signatures in the snippets match the actual implementations.- The
Workspaceimport path in the example (from mlflow.entities import Workspace) aligns with the public API you expect users to use (tests here importWorkspacefrommlflow.entities.workspace).If they differ, updating the examples to mirror the real interfaces will keep the docs from drifting.
tests/store/tracking/test_rest_store.py (1)
2862-2880:log_spansversion checking and caching tests are comprehensive but one subtest is brittlePros:
test_log_spans_with_version_checkexercises:
- failure to retrieve version,
- version < 3.4 rejecting,
- 3.4 / 3.5 success paths.
test_server_version_check_cachingconfirms_get_server_versionis invoked once per host and that laterlog_spanscalls reuse the cached version (even across RestStore instances).Concern:
- The “real timeout test” (Test 5 within
test_log_spans_with_version_check) relies on an actual connection attempt tohttps://host5and asserts elapsed time< 5s. This can become flaky under slow DNS or unusual network conditions.Consider stubbing
rest_utils.http_requestwith atime.sleepto simulate timeout instead of performing a real network call; behavior would be deterministic while still validating timeout semantics.Also applies to: 2882-2947, 2949-3021
mlflow/store/workspace/rest_store.py (1)
16-18: REST workspace store CRUD looks correct; consider callingsuper().__init__()
- Endpoints and methods (
GET/POST/PATCH/DELETE /api/2.0/mlflow/workspaces[...]) and the use ofverify_rest_responseare consistent and correct.create_workspaceexplicitly maps400/409+RESOURCE_ALREADY_EXISTSintoMlflowException(..., RESOURCE_ALREADY_EXISTS), which matches MLflow’s error semantics.- Handling of empty response bodies by falling back to the request values is pragmatic and avoids extra server requirements.
get_default_workspaceexplicitly raising an invalid-parameter exception is clear about this provider’s capabilities.One small robustness improvement:
class RestWorkspaceStore(AbstractStore): """REST-backed workspace store implementation.""" def __init__(self, get_host_creds): - self.get_host_creds = get_host_creds + super().__init__() + self.get_host_creds = get_host_credsCalling
super().__init__()future-proofs the class ifAbstractStorelater adds initialization logic.Also applies to: 19-106
mlflow/server/auth/db/models.py (1)
104-125: New SqlWorkspacePermission model is straightforward; consider optional FK to users
- Composite primary key on
(workspace, username, resource_type)plus indices onusernameandworkspaceshould support common lookup patterns efficiently.to_mlflow_entity()correctly populatesWorkspacePermission.Depending on how strictly you want to enforce referential integrity between
workspace_permissions.usernameandusers.username, you might optionally add aForeignKey("users.username")and/or a relationship onSqlUser. If usernames can be non-user principals, the current design is fine as is.mlflow/server/otel_api.py (1)
1-11: Update module docstring to reflect non-placeholder implementationThe docstring still describes this as a minimal placeholder and implies span ingestion is not fully implemented, which is no longer accurate given the concrete ingestion and conversion logic below. Consider updating it to describe the actual behavior and workspace-aware handling to avoid confusion.
mlflow/cli/__init__.py (1)
587-597: Avoid duplicating workspace env setup via both os.environ and EnvironmentVariable.setYou’re setting workspace-related env in two ways:
- Directly via
os.environ[...] = ...above.- Via
MLFLOW_ENABLE_WORKSPACES.set(enable_workspaces)andMLFLOW_WORKSPACE_URI.set(...)here.Since the
EnvironmentVariablehelpers already write toos.environ, maintaining both paths adds redundancy and a chance for drift if behavior changes in one place. Consider consolidating on theEnvironmentVariablehelpers for all writes and keeping the logging/echo behavior here.mlflow/server/auth/entities.py (2)
148-151: RegisteredModelPermission workspace handling looks sound; consider explicit type-level docsThe added
workspaceparameter defaults toNone, is normalized viaresolve_entity_workspace_name, and is round-tripped throughto_json/from_jsonwith a backward-compatibledictionary.get("workspace"). That keeps existing JSON payloads and positional call sites working while making workspace explicit. It might be worth documenting (in a docstring or higher-level docs) that omittingworkspacein JSON will resolve to the default workspace so callers understand the behavior, but the implementation itself is solid.Also applies to: 155-157, 175-181, 184-190
244-288: Align WorkspacePermission workspace normalization and validation with other entities
WorkspacePermissioncurrently storesworkspaceas passed and validates required fields usingif not all([...]). For consistency and to avoid surprising falsy-value behavior:
- Consider normalizing
workspaceviaresolve_entity_workspace_name(and possibly reusingWorkspaceNameValidator) so this entity behaves likeRegisteredModelPermission.- Prefer explicit
is Nonechecks overall([...])so that values like"0"or numeric permission codes don't get rejected purely because they may be falsy in future.This keeps workspace semantics and validation consistent across auth entities.
mlflow/store/workspace/sqlalchemy_store.py (1)
49-65: create_workspace behavior and error mapping look correct; consider trimming DB error detail
create_workspacevalidates the name, persists via a managed session, and mapsIntegrityErrortoMlflowExceptionwithRESOURCE_ALREADY_EXISTS, which matches the tests and MLflow patterns. The message currently appends the raw DB exception (Error: {exc}); if you care about not exposing backend details to API consumers, you might drop that suffix and rely on logs instead, but functionally this is fine.tests/server/test_workspace_endpoints.py (1)
59-72: Strengthen endpoint tests by asserting store call argumentsThe endpoint tests currently verify status codes and response payloads and only assert that
create_workspace,update_workspace, anddelete_workspacewere called once. To better lock in the handler contract, consider asserting the arguments as well, for example:
- For create: inspect
mock_workspace_store.create_workspace.call_argsand assert theWorkspacepassed has the expectednameanddescription.- For update: similarly validate the
Workspaceargument (or separate name/description parameters if you change the store signature).- For delete: you already assert
"team-e"is passed, which is good; mirroring this pattern for create/update would improve coverage.This would catch regressions where request JSON isn’t correctly mapped into the workspace entity.
Also applies to: 85-97, 100-105
mlflow/tracking/_workspace/fluent.py (1)
36-50: set_workspace behavior is reasonable; consider marking it experimental like other public APIsThe
set_workspaceimplementation is thread-safe, correctly handlesNoneby clearing both in-process context andMLFLOW_WORKSPACE, and validates non-default names withWorkspaceNameValidator. The CRUD helpers are consistently funneled through_workspace_client_calland apply name validation except for the reserved default, which aligns with store semantics.One minor consistency point:
set_workspaceis exported via__all__but is not annotated with@experimental(version="3.7.0")like the other public functions in this module. If the intention is for all newly introduced workspace APIs to be experimental, you may want to add the decorator toset_workspaceas well; otherwise, it effectively appears more stable than the others.Also applies to: 52-94, 96-104
tests/store/model_registry/test_sqlalchemy_store_workspace.py (1)
36-105: Consider asserting error codes for all cross‑workspace failure pathsFor many cross‑workspace operations you assert both the error message and
error_code == "RESOURCE_DOES_NOT_EXIST", but a few (e.g.,rename_registered_model/delete_registered_modelin the"team-b"context, and some metadata operations) only assert on the message. For long‑term stability, you may want to assert the error code in those places as well so behavior remains consistent even if messages change.Also applies to: 141-173, 228-252
mlflow/tracking/client.py (1)
1872-1885: Optional: enforceartifact_locationrestriction when workspaces are enabledThe
create_experimentdocstring now states thatartifact_locationcannot be specified when workspaces are enabled because a workspace‑scoped default is always used, but the method still passes the argument through unconditionally to the tracking store. If you want failures to be more immediate and client‑side, consider adding a guard like:def create_experiment( self, name: str, artifact_location: str | None = None, tags: dict[str, Any] | None = None, ) -> str: @@ - """ - Create an experiment. + """ + Create an experiment. @@ - """ - return self._tracking_client.create_experiment(name, artifact_location, tags) + """ + # When workspaces are enabled, artifact locations are determined by the workspace. + if workspace_utils.workspaces_enabled() and artifact_location is not None: + raise MlflowException( + "artifact_location cannot be specified when workspaces are enabled; " + "a workspace-scoped default artifact location is always used.", + error_code=INVALID_PARAMETER_VALUE, + ) + return self._tracking_client.create_experiment(name, artifact_location, tags)(Adjust helper / error code usage to match
workspace_utilsand existing validation utilities.) If you prefer to keep this server‑side only, the current implementation is still consistent with the updated documentation.Also applies to: 1924-1925
mlflow/tracking/_workspace/registry.py (1)
37-47: Optional: consider broadening entrypoint error handling or logging detail
register_entrypoints()currently catches onlyAttributeErrorandImportErrorwhen loading workspace provider plugins, emitting a warning. If a provider raises another exception type during load (e.g., misconfiguration), initialization will fail hard. Depending on how defensive you want plugin loading to be, you might consider catchingExceptionhere and including provider name and exception type in the warning to avoid taking down the process due to a single bad plugin.mlflow/store/tracking/sqlalchemy_store.py (8)
449-527: Artifact location derivation and isolation checks are well thought out
_get_artifact_locationnow parameterizes onworkspaceand correctly:
- Uses the provider’s resolved root when workspaces are enabled, with an optional
/workspaces/<name>prefix whenshould_appendis True.- Falls back to
artifact_root_uriwhen the provider doesn’t implement resolution or returnsNone.
_validate_artifact_isolation_constraintsand the reserved prefix checks around/workspacesare a sensible guardrail to prevent enabling workspaces on top of conflicting pre-existing artifact layouts. The startup scan also skips heavy checks when non-default workspaces already exist, which is a nice optimization for already-migrated installs.
528-548: Single-tenant guard_ensure_no_non_default_workspace_experimentsis correct but message could mention workspace flagThe method properly fails fast if any experiment has
workspace != DEFAULT_WORKSPACE_NAME, preventing silently running in single-tenant mode on a DB that already has multi-workspace data. You might consider explicitly mentioning theMLFLOW_ENABLE_WORKSPACES(or equivalent) flag in the error message to make remediation clearer to operators, but the current wording is already actionable.
891-921:_get_run_inputsworkspace scoping is sound but always uses the active workspaceJoining through
SqlExperimentand filteringSqlExperiment.workspace == workspacein_get_run_inputsensures dataset inputs on runs in other workspaces are not exposed. Note that this usesself._get_active_workspace()rather than the workspace bound to any outer_workspace_session. That’s okay today (workspace context is global), but if you ever introduce cross-workspace administrative views, you might want to thread an explicitworkspaceargument instead of re-reading global context.
1424-1458: Bulk metric history properly filters run_ids by workspace when enabledIn
get_metric_history_bulk, theallowed_run_idsderivation via_runs_queryunder_workspaces_enabled()ensures that only runs in the active workspace contribute metrics. The early return whenallowed_run_idsis empty is correct and avoids unnecessary work. This does add an extra query in workspace mode, but it’s acceptable given the isolation requirements.
1976-2109:_log_inputs_implenforces dataset/run workspace consistency but has a minor key inconsistencyUsing
workspace = self._get_active_workspace()up front and then:
- Joining
SqlDataset→SqlExperimentand filteringSqlExperiment.workspace == workspaceto detect existing datasets.- Storing new
SqlDatasetrows with the callingexperiment_id.- Creating
SqlInputedges without workspace columns (sincedataset_idandrun_idare already scoped)is a solid pattern for multi-tenancy.
Minor nit:
input_uuidsis keyed as(source_id, destination_id)for existing inputs but uses(dataset_input.dataset.name, dataset_input.dataset.digest)when populating new entries. Since membership checks later use(dataset_uuid, run_id), the keys for new entries are never read and are effectively dead bookkeeping. It doesn’t break correctness (datasets are already deduped by(name,digest)), but aligning the key shape would improve clarity.- for existing_input in existing_inputs: - input_uuids[(existing_input.source_id, existing_input.destination_id)] = ( - existing_input.input_uuid - ) + for existing_input in existing_inputs: + input_uuids[(existing_input.source_id, existing_input.destination_id)] = ( + existing_input.input_uuid + ) ... - if (dataset_uuid, run_id) not in input_uuids: + if (dataset_uuid, run_id) not in input_uuids: new_input_uuid = uuid.uuid4().hex - input_uuids[(dataset_input.dataset.name, dataset_input.dataset.digest)] = ( - new_input_uuid - ) + input_uuids[(dataset_uuid, run_id)] = new_input_uuid
2127-2208: Model input/output helpers correctly adopt workspace-aware run filtering
_get_model_inputsand_get_model_outputs_bulkboth:
- Reuse
_runs_queryunder_workspaces_enabled()when deriving allowed run_ids.- Fall back to identity mappings for missing run_ids to preserve API shape.
The recursion in
_get_model_inputswhensession is Noneis simple and safe, but if you ever extend the method, consider splitting the internal implementation into a non-recursive private helper (e.g._get_model_inputs_in_session) for clarity.
4575-4657:upsert_dataset_recordscorrectly reuses dataset.workspace in the final updateAfter using
_get_dataset_recordto fetchdatasetwithin the correct workspace, the finalupdateonSqlEvaluationDatasetincludes bothdataset_idanddataset.workspacein the filter. Whiledataset_idis already globally unique (UUID-derived), this extra predicate is harmless and future-proofs against any accidental reuse patterns. The schema/profile recomputation is entirely local to this dataset.
4671-4687:get_dataset_experiment_idsis workspace-safe but could optionally filter experimentsWhen workspaces are enabled, the upfront
_get_dataset_recordcall inside a managed session is a good guard; returning[]when it fails avoids leaking associations for datasets that belong to other workspaces. The subsequentsearch_entities_by_sourcecall is global, but becausedataset_idis globally unique and associations are only created for experiments validated in the same workspace, this is effectively safe. If you ever relax those invariants, consider adding a join toSqlExperimentwith a workspace filter here as well.mlflow/server/auth/sqlalchemy_store.py (2)
35-47:_WORKSPACE_RESOURCE_TYPESand validation guard against unsupported resource typesDefining
_WORKSPACE_RESOURCE_TYPES = {"*", "experiments", "registered_models"}and enforcing it in_validate_workspace_resource_type()ensures you don’t accidentally persist or reason about undefined resource types. The error message enumerating valid types should be very helpful at API boundaries.
389-407:list_accessible_workspace_namescorrectly aggregates readable workspaces via wildcard supportGiven a username:
- It pulls all
SqlWorkspacePermissionrows for[username, "*"].- Validates each
permissionand usesget_permission(permission).can_readto decide whether the workspace is accessible.- Returns a deduplicated
set[str]of workspace names.This honors wildcard-user permissions and any resource type (since any readable resource implies the workspace is “accessible”), which is a reasonable definition for workspace listing.
mlflow/server/auth/__init__.py (4)
185-195: AuthContext helper is simple and safe; consider documenting expected attributesUsing
AuthContextas a thin wrapper overAuthorizationis fine and makes later code cleaner. Since you depend onauth.usernameand sometimesauth.is_admin, a short docstring onAuthContext(and/or on the expected shape ofAuthorization) would help future custom auth implementations know which attributes they must provide.Please double-check any custom
authorization_functionyou may have configured still returns an object exposing ausernameattribute, as previously assumed.
505-613: Experiment / run / model / scorer permission getters correctly pass workspace fallbacksThe updated
_get_permission_from_*helpers now:
- Authenticate once per call and derive
AuthContext.- Call
_get_permission_from_store_or_defaultwith aworkspace_fallbackpointing at the appropriate_workspace_permission_for_*helper.
That wiring looks consistent across experiments, runs, logged models, registered models, and scorers.To avoid repeated
authenticate_request()calls and redundantAuthContextconstruction across these helpers, consider a small internal helper that takes(experiment_id | model_name, resource_type)and returns(auth, auth_ctx)plus a boundworkspace_fallback.Please re-run the auth regression tests for all affected endpoints (experiment / run / logged model / registered model / scorer) to validate there are no behavioral differences when
MLFLOW_ENABLE_WORKSPACESis unset.
711-748: Creation of experiments and registered models is now workspace-scoped
validate_can_create_experiment/validate_can_create_registered_model:
- Preserve the historical “always allowed” behavior when workspaces are disabled.
- In workspace mode, require either admin or
MANAGEpermission in the current workspace.- Return
Falsewhen no active workspace is set.
This is a good tightening of semantics for multi-tenant deployments.You might want to log a debug-level message when
workspace_name is Nonein workspace mode, to make misconfigured clients (not setting a workspace) easier to diagnose.Please ensure client-side workspace selection helpers (e.g.,
mlflow.set_workspace/MlflowClient.set_workspace) are documented as required before calling creation APIs in workspace mode.
1410-1465:filter_list_workspacessafely handles malformed responses and enforces per-user visibilityThe response filter:
- Short-circuits for admins.
- Gracefully handles JSON decoding failures and malformed
workspacespayloads by returning an empty list instead of erroring.- Uses
list_accessible_workspace_namesplusgrant_default_workspace_accessand default workspace resolution to compute allowed names.- Never exposes workspaces if the auth store claims to support workspaces but lacks permission data (logs a warning and returns none).
This is a good defense-in-depth design.You might consider logging the total number of filtered-out workspaces at debug level to aid troubleshooting in multi-tenant environments.
Please verify that
list_accessible_workspace_namesis efficient enough to be called on every list-workspaces response, especially in environments with many workspaces.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (103)
.github/workflows/master.yml(1 hunks)docs/api_reference/api_inventory.txt(14 hunks)docs/api_reference/source/rest-api.rst(1 hunks)docs/docs/self-hosting/architecture/overview.mdx(1 hunks)docs/docs/self-hosting/index.mdx(1 hunks)docs/docs/self-hosting/workspaces/configuration.mdx(1 hunks)docs/docs/self-hosting/workspaces/getting-started.mdx(1 hunks)docs/docs/self-hosting/workspaces/index.mdx(1 hunks)docs/docs/self-hosting/workspaces/permissions.mdx(1 hunks)docs/docs/self-hosting/workspaces/workspace-providers.mdx(1 hunks)docs/sidebarsSelfHosting.ts(1 hunks)mlflow/__init__.py(5 hunks)mlflow/cli/__init__.py(6 hunks)mlflow/entities/evaluation_dataset.py(7 hunks)mlflow/entities/experiment.py(6 hunks)mlflow/entities/model_registry/model_version.py(4 hunks)mlflow/entities/model_registry/registered_model.py(6 hunks)mlflow/entities/webhook.py(7 hunks)mlflow/entities/workspace.py(1 hunks)mlflow/environment_variables.py(3 hunks)mlflow/models/model.py(2 hunks)mlflow/server/__init__.py(4 hunks)mlflow/server/auth/__init__.py(35 hunks)mlflow/server/auth/basic_auth.ini(1 hunks)mlflow/server/auth/client.py(4 hunks)mlflow/server/auth/config.py(2 hunks)mlflow/server/auth/db/migrations/versions/2ed73881770d_workspace_permissions.py(1 hunks)mlflow/server/auth/db/models.py(4 hunks)mlflow/server/auth/db/utils.py(0 hunks)mlflow/server/auth/entities.py(5 hunks)mlflow/server/auth/routes.py(1 hunks)mlflow/server/auth/sqlalchemy_store.py(6 hunks)mlflow/server/handlers.py(21 hunks)mlflow/server/otel_api.py(6 hunks)mlflow/server/workspace_helpers.py(1 hunks)mlflow/store/artifact/artifact_repo.py(2 hunks)mlflow/store/artifact/artifact_repository_registry.py(2 hunks)mlflow/store/artifact/http_artifact_repo.py(4 hunks)mlflow/store/db/utils.py(2 hunks)mlflow/store/model_registry/abstract_store.py(1 hunks)mlflow/store/model_registry/base_rest_store.py(3 hunks)mlflow/store/model_registry/dbmodels/models.py(7 hunks)mlflow/store/model_registry/sqlalchemy_store.py(37 hunks)mlflow/store/tracking/abstract_store.py(1 hunks)mlflow/store/tracking/dbmodels/models.py(6 hunks)mlflow/store/tracking/rest_store.py(5 hunks)mlflow/store/tracking/sqlalchemy_store.py(89 hunks)mlflow/store/workspace/__init__.py(1 hunks)mlflow/store/workspace/abstract_store.py(1 hunks)mlflow/store/workspace/dbmodels/models.py(1 hunks)mlflow/store/workspace/rest_store.py(1 hunks)mlflow/store/workspace/sqlalchemy_store.py(1 hunks)mlflow/store/workspace/utils.py(1 hunks)mlflow/store/workspace_aware_mixin.py(1 hunks)mlflow/tracking/_tracking_service/client.py(6 hunks)mlflow/tracking/_workspace/__init__.py(1 hunks)mlflow/tracking/_workspace/client.py(1 hunks)mlflow/tracking/_workspace/context.py(1 hunks)mlflow/tracking/_workspace/fluent.py(1 hunks)mlflow/tracking/_workspace/registry.py(1 hunks)mlflow/tracking/_workspace/utils.py(1 hunks)mlflow/tracking/client.py(6 hunks)mlflow/tracking/default_experiment/registry.py(2 hunks)mlflow/tracking/fluent.py(2 hunks)mlflow/utils/rest_utils.py(6 hunks)mlflow/utils/server_cli_utils.py(1 hunks)mlflow/utils/workspace_utils.py(1 hunks)tests/entities/model_registry/test_model_version.py(5 hunks)tests/entities/model_registry/test_registered_model.py(8 hunks)tests/entities/test_evaluation_dataset.py(1 hunks)tests/entities/test_experiment.py(4 hunks)tests/entities/test_webhook.py(2 hunks)tests/genai/datasets/test_fluent.py(1 hunks)tests/helpers/__init__.py(1 hunks)tests/helpers/db_mocks.py(1 hunks)tests/server/auth/db/test_cli.py(3 hunks)tests/server/auth/test_auth.py(6 hunks)tests/server/auth/test_client.py(5 hunks)tests/server/auth/test_sqlalchemy_store.py(9 hunks)tests/server/test_handlers.py(3 hunks)tests/server/test_init.py(3 hunks)tests/server/test_otel_api.py(1 hunks)tests/server/test_workspace_endpoints.py(1 hunks)tests/server/test_workspace_handlers.py(1 hunks)tests/store/model_registry/test_rest_store.py(1 hunks)tests/store/model_registry/test_sqlalchemy_store.py(2 hunks)tests/store/model_registry/test_sqlalchemy_store_workspace.py(1 hunks)tests/store/tracking/test_rest_store.py(12 hunks)tests/store/tracking/test_sqlalchemy_store.py(3 hunks)tests/store/tracking/test_sqlalchemy_store_workspace.py(1 hunks)tests/store/workspace/test_rest_store.py(1 hunks)tests/store/workspace/test_sqlalchemy_store.py(1 hunks)tests/store/workspace/test_workspace_validator.py(1 hunks)tests/test_cli.py(29 hunks)tests/tracking/_model_registry/test_utils.py(2 hunks)tests/tracking/_tracking_service/test_utils.py(3 hunks)tests/tracking/test_client_workspace.py(1 hunks)tests/tracking/test_mlflow_artifacts.py(1 hunks)tests/tracking/test_rest_tracking.py(13 hunks)tests/tracking/test_workspace_registry.py(1 hunks)tests/tracking/test_workspace_utils.py(1 hunks)tests/utils/test_rest_utils.py(3 hunks)tests/utils/test_workspace_utils.py(1 hunks)
💤 Files with no reviewable changes (1)
- mlflow/server/auth/db/utils.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (27)
- GitHub Check: r
- GitHub Check: build
- GitHub Check: pyfunc (4)
- GitHub Check: pyfunc (3)
- GitHub Check: protect
- GitHub Check: java
- GitHub Check: windows (3)
- GitHub Check: windows (1)
- GitHub Check: windows (4)
- GitHub Check: windows (2)
- GitHub Check: pyfunc (2)
- GitHub Check: pyfunc (1)
- GitHub Check: flavors
- GitHub Check: evaluate (1)
- GitHub Check: evaluate (2)
- GitHub Check: typescript-sdk (22)
- GitHub Check: genai
- GitHub Check: models (1)
- GitHub Check: models (2)
- GitHub Check: database
- GitHub Check: build (tracing)
- GitHub Check: python-skinny
- GitHub Check: build (skinny)
- GitHub Check: python (2)
- GitHub Check: python (3)
- GitHub Check: python (1)
- GitHub Check: build (dev)
| if enable_workspaces: | ||
| os.environ[MLFLOW_ENABLE_WORKSPACES.name] = "true" | ||
| if workspace_store_uri: | ||
| os.environ[MLFLOW_WORKSPACE_URI.name] = workspace_store_uri | ||
| elif workspace_store_uri: | ||
| click.echo( | ||
| "Ignoring --workspace-store-uri because workspaces are not enabled. " | ||
| "Use --enable-workspaces to activate workspace mode.", | ||
| err=True, | ||
| ) | ||
|
|
There was a problem hiding this comment.
Clarify behavior when workspace URI is provided but workspaces are disabled
Currently, when enable_workspaces is false and workspace_store_uri is set, you:
- Emit a user-facing
click.echosaying--workspace-store-uriis being ignored. - Later still pass
workspace_store_uriintoinitialize_backend_stores(see below).
This can be confusing, since the message implies the URI is unused. If the intention is to only use workspace_store_uri when workspaces are enabled, consider omitting it from initialize_backend_stores in this branch. If it is still used for some internal wiring even with workspaces disabled, the message should be softened to something like “Workspaces are disabled; the workspace store will not be used for workspace routing” rather than “ignoring”.
🤖 Prompt for AI Agents
In mlflow/cli/__init__.py around lines 553 to 563, the CLI echoes that
--workspace-store-uri is being ignored when workspaces are disabled but later
still passes workspace_store_uri into initialize_backend_stores, which is
confusing; update the behavior to be consistent by either (A) not passing
workspace_store_uri into initialize_backend_stores when enable_workspaces is
false (so the value is truly ignored), or (B) if the URI is required for
internal wiring even with workspaces off, change the echo message to clearly
state that workspaces are disabled and the workspace store will not be used for
workspace routing while it may still be used internally; implement whichever
option fits the intended design and adjust the conditional logic and message
accordingly.
| def from_proto(cls, proto): | ||
| # input: mlflow.protos.model_registry_pb2.RegisteredModel | ||
| # returns RegisteredModel entity | ||
| # Workspace is intentionally derived from the request context (falling back to the active | ||
| # workspace resolver) and therefore is not persisted in the ProtoRegisteredModel. | ||
| registered_model = cls( | ||
| proto.name, | ||
| proto.creation_timestamp, | ||
| proto.last_updated_timestamp, | ||
| proto.description, | ||
| [ModelVersion.from_proto(mvd) for mvd in proto.latest_versions], | ||
| workspace=None, | ||
| ) |
There was a problem hiding this comment.
Fix from_proto argument ordering so workspace is correctly set
In from_proto, the constructor is currently called as:
registered_model = cls(
proto.name,
proto.creation_timestamp,
proto.last_updated_timestamp,
proto.description,
[ModelVersion.from_proto(mvd) for mvd in proto.latest_versions],
workspace=None,
)Because workspace=None is passed positionally, it is bound to the tags parameter instead of the workspace parameter. Right now this happens to behave the same as omitting tags (they both end up as None), but it’s misleading and fragile if the signature changes.
Prefer an explicit keyword for workspace (and other non-leading args if desired), e.g.:
- registered_model = cls(
- proto.name,
- proto.creation_timestamp,
- proto.last_updated_timestamp,
- proto.description,
- [ModelVersion.from_proto(mvd) for mvd in proto.latest_versions],
- workspace=None,
- )
+ registered_model = cls(
+ name=proto.name,
+ creation_timestamp=proto.creation_timestamp,
+ last_updated_timestamp=proto.last_updated_timestamp,
+ description=proto.description,
+ latest_versions=[ModelVersion.from_proto(mvd) for mvd in proto.latest_versions],
+ workspace=None,
+ )This keeps the intent clear and ensures workspace is wired correctly.
🤖 Prompt for AI Agents
In mlflow/entities/model_registry/registered_model.py around lines 138 to 150,
the call to cls(...) passes workspace=None positionally which binds it to the
tags parameter; change the call to pass workspace explicitly as a keyword (e.g.
workspace=None) so workspace is guaranteed to be set correctly (you may also
pass tags explicitly if needed) and avoid positional ambiguity if the
constructor signature changes.
| #: Specifies the active workspace for client operations. | ||
| #: (default: ``None``) | ||
| MLFLOW_WORKSPACE = _EnvironmentVariable("MLFLOW_WORKSPACE", str, None) | ||
|
|
||
| #: Specifies the workspace provider URI. Falls back to the tracking URI when unset. | ||
| #: (default: ``None``) | ||
| MLFLOW_WORKSPACE_URI = _EnvironmentVariable("MLFLOW_WORKSPACE_URI", str, None) | ||
|
|
||
| #: Specifies the registry URI. | ||
| #: (default: ``None``) | ||
| MLFLOW_REGISTRY_URI = _EnvironmentVariable("MLFLOW_REGISTRY_URI", str, None) | ||
|
|
||
| #: Specifies the workspace provider backend URI. | ||
| #: Defaults to the tracking URI when unset. | ||
| MLFLOW_WORKSPACE_URI = _EnvironmentVariable("MLFLOW_WORKSPACE_URI", str, None) | ||
|
|
There was a problem hiding this comment.
Remove duplicate definitions of workspace env vars
MLFLOW_WORKSPACE_URI and MLFLOW_ENABLE_WORKSPACES are each defined twice in this module (lines 109–112 & 117–120, and 450–453 & 1143–1146). While Python will just overwrite the first binding, this duplication is confusing and makes future edits easy to get wrong.
Consider keeping a single definition for each and deleting the earlier or later duplicate, keeping the clearer docstring.
-#: Specifies the workspace provider backend URI.
-#: Defaults to the tracking URI when unset.
-MLFLOW_WORKSPACE_URI = _EnvironmentVariable("MLFLOW_WORKSPACE_URI", str, None)
+#: Specifies the workspace provider backend URI.
+#: Defaults to the tracking URI when unset.
+MLFLOW_WORKSPACE_URI = _EnvironmentVariable("MLFLOW_WORKSPACE_URI", str, None)
@@
-#: Enables workspace-aware multi-tenancy features on the MLflow server.
-#: (default: ``False``)
-MLFLOW_ENABLE_WORKSPACES = _BooleanEnvironmentVariable("MLFLOW_ENABLE_WORKSPACES", False)
+#: Enables workspace-aware (multi-tenant) behavior for MLflow servers and clients.
+#: (default: ``False``)
+MLFLOW_ENABLE_WORKSPACES = _BooleanEnvironmentVariable("MLFLOW_ENABLE_WORKSPACES", False)Pick one location for each env var (e.g., near the other tracking/registry envs) and delete the other.
Also applies to: 450-453, 1143-1146
🤖 Prompt for AI Agents
In mlflow/environment_variables.py around lines 105-120 (and also review
duplicates at 450-453 and 1143-1146), there are duplicate definitions for
MLFLOW_WORKSPACE_URI and MLFLOW_ENABLE_WORKSPACES; remove the redundant
definitions so each environment variable is defined only once, keep the version
with the clearer docstring placed near other tracking/registry env vars, delete
the earlier/later duplicate occurrence(s), and run a quick search to ensure no
other duplicate bindings remain in the file.
| # When true, users inherit default_permission for the reserved 'default' workspace. | ||
| grant_default_workspace_access = false |
There was a problem hiding this comment.
🧩 Analysis chain
Verify the new configuration is properly wired and documented.
The new grant_default_workspace_access configuration option is added with a conservative false default, which appears appropriate for backward compatibility. However, ensure the following before merging:
- Downstream code in the authorization/workspace logic properly reads and honors this configuration flag.
- This configuration change is documented (changelog, config guide, migration notes if needed).
- The
falsedefault aligns with the multi-tenancy security model and doesn't require updates for existing deployments.
Run the following script to verify that the configuration key is being read and used consistently:
🏁 Script executed:
#!/bin/bash
# Description: Verify that grant_default_workspace_access is wired into the authorization logic.
# Search for references to this config key in the codebase
rg -n 'grant_default_workspace_access' --type=py
# Search for where basic_auth.ini is parsed to ensure the new key is handled
rg -n 'basic_auth\.ini' --type=py -A 5 -B 5Length of output: 2211
---
🏁 Script executed:
```shell
#!/bin/bash
# Verify usage context of grant_default_workspace_access in authorization logic
# Check how grant_default_workspace_access is used in __init__.py
rg -n 'grant_default_workspace_access' mlflow/server/auth/__init__.py -A 3 -B 3
# Check for changelog/documentation files
fd -e md -e rst -e txt . -type f | grep -i -E '(changelog|migration|release|doc|guide)' | head -20
# Search for references to default_permission and workspace in docs
rg -n 'default_permission|default.*workspace' --type=md --type=txt --type=rst 2>/dev/null | head -20
Length of output: 1587
🏁 Script executed:
#!/bin/bash
# Check if there are any CHANGELOG or migration docs
find . -maxdepth 3 -type f \( -name "CHANGELOG*" -o -name "HISTORY*" -o -name "MIGRATION*" \) 2>/dev/null | head -10Length of output: 14
---
🏁 Script executed:
```shell
#!/bin/bash
# Check CHANGELOG.md for documentation of this feature
rg -n 'grant_default_workspace_access|default workspace|workspace access' CHANGELOG.md -A 2 -B 2
# Also check for any workspace-related changes in recent changelog entries
rg -n 'workspace' CHANGELOG.md | head -30
Length of output: 986
| def _ensure_default_workspace_experiment(self) -> None: | ||
| """ | ||
| Ensure the default experiment exists in the provider's default workspace when enabled. | ||
| """ | ||
|
|
||
| provider = self._get_workspace_provider_instance() | ||
| default_workspace, supports_default = get_default_workspace_optional( | ||
| provider, logger=_logger | ||
| ) | ||
| provider_name = ( | ||
| type(self._workspace_provider).__name__ if self._workspace_provider else "unknown" | ||
| ) | ||
|
|
||
| if not supports_default: | ||
| _logger.warning( | ||
| "Workspace provider %s does not expose a default workspace; " | ||
| "skipping default experiment bootstrap.", | ||
| provider_name, | ||
| ) | ||
| return | ||
|
|
||
| if default_workspace is None: | ||
| return | ||
|
|
||
| with workspace_context.WorkspaceContext(default_workspace.name): | ||
| if self.get_experiment_by_name(Experiment.DEFAULT_EXPERIMENT_NAME) is None: | ||
| with self.ManagedSessionMaker() as session: | ||
| self._create_default_experiment( | ||
| session, workspace_override=default_workspace.name | ||
| ) | ||
|
|
||
| def _bootstrap_default_experiment_for_workspace(self, workspace: str) -> None: | ||
| """ | ||
| Create the default experiment for the specified workspace in an isolated session. | ||
| """ | ||
| MLflow UI and client code expects a default experiment with ID 0. | ||
|
|
||
| with self.ManagedSessionMaker() as bootstrap_session: | ||
| self._create_default_experiment(bootstrap_session, workspace_override=workspace) | ||
|
|
||
| def _create_default_experiment(self, session, workspace_override: str | None = None): | ||
| """ | ||
| MLflow UI and client code expects a default experiment with ID 0 in single-tenant mode. | ||
| This method uses SQL insert statement to create the default experiment as a hack, since | ||
| experiment table uses 'experiment_id' column is a PK and is also set to auto increment. | ||
| MySQL and other implementation do not allow value '0' for such cases. | ||
| experiment table uses 'experiment_id' column as a PK and is also set to auto increment. | ||
| MySQL and other implementations do not allow value '0' for such cases. | ||
|
|
||
| ToDo: Identify a less hacky mechanism to create default experiment 0 | ||
| """ | ||
| table = SqlExperiment.__tablename__ | ||
| creation_time = get_current_time_millis() | ||
|
|
||
| if self._workspaces_enabled(): | ||
| workspace = workspace_override or self._get_active_workspace() | ||
| existing = ( | ||
| session.query(SqlExperiment) | ||
| .filter( | ||
| SqlExperiment.name == Experiment.DEFAULT_EXPERIMENT_NAME, | ||
| SqlExperiment.workspace == workspace, | ||
| ) | ||
| .one_or_none() | ||
| ) | ||
| if existing is not None: | ||
| return | ||
|
|
||
| experiment = SqlExperiment( | ||
| name=Experiment.DEFAULT_EXPERIMENT_NAME, | ||
| lifecycle_stage=LifecycleStage.ACTIVE, | ||
| artifact_location=None, | ||
| creation_time=creation_time, | ||
| last_update_time=creation_time, | ||
| workspace=workspace, | ||
| ) | ||
| session.add(experiment) | ||
| try: | ||
| session.flush() | ||
| except IntegrityError as exc: | ||
| session.rollback() | ||
| _logger.debug( | ||
| "Default experiment already exists for workspace '%s'; another worker likely " | ||
| "created it. Swallowing IntegrityError: %s", | ||
| workspace, | ||
| exc, | ||
| ) | ||
| return | ||
|
|
||
| if not experiment.artifact_location: | ||
| experiment.artifact_location = self._get_artifact_location( | ||
| experiment.experiment_id, workspace | ||
| ) | ||
| session.flush() | ||
| return | ||
|
|
||
| table = SqlExperiment.__tablename__ |
There was a problem hiding this comment.
Default experiment handling per workspace is robust, but depends on migrations for older DBs
The new _ensure_default_workspace_experiment / _bootstrap_default_experiment_for_workspace / _create_default_experiment flow correctly:
- Honors providers that do not expose a default workspace (warning + early return).
- Creates a default experiment per workspace only if one doesn’t already exist, handling races via
IntegrityErrorswallow. - Keeps the legacy
experiment_id=0creation in non-workspace mode, now withworkspace=DEFAULT_WORKSPACE_NAME.
This assumes that existing experiments have their workspace column consistently populated (e.g., via migrations); otherwise, lookups that filter on SqlExperiment.workspace == workspace could fail for legacy rows with NULL workspace. If you expect legacy DBs, ensure the migrations backfill workspace to DEFAULT_WORKSPACE_NAME.
🤖 Prompt for AI Agents
In mlflow/store/tracking/sqlalchemy_store.py around lines 314 to 405, the new
per-workspace default-experiment logic assumes existing DB rows have workspace
populated; on legacy DBs with NULL workspace this will skip matching defaults.
Add a DB migration that backfills NULL SqlExperiment.workspace values to the
DEFAULT_WORKSPACE_NAME for existing experiments, and, as a safety change in this
function, make the existence check tolerant of legacy NULLs by treating NULL
workspace as the default workspace when workspace == DEFAULT_WORKSPACE_NAME
(i.e., match rows where workspace == workspace OR (workspace IS NULL AND
workspace == DEFAULT_WORKSPACE_NAME)); update any related inserts/reads
similarly so legacy rows are considered default until migration runs.
| def _runs_query(self, session): | ||
| """Return a SqlRun query scoped to the active workspace.""" | ||
|
|
||
| query = session.query(SqlRun) | ||
| if not self._workspaces_enabled(): | ||
| return query | ||
|
|
||
| workspace = self._get_active_workspace() | ||
| return query.join( | ||
| SqlExperiment, SqlExperiment.experiment_id == SqlRun.experiment_id | ||
| ).filter(SqlExperiment.workspace == workspace) | ||
|
|
||
| def _trace_query(self, session, for_update_or_delete=False): | ||
| """Return a SqlTraceInfo query scoped to the active workspace when enabled. | ||
|
|
||
| Args: | ||
| for_update_or_delete: If True, returns a query without joins that can be used | ||
| for update/delete operations. Workspace filtering must be | ||
| done separately in this case. | ||
| """ | ||
| query = session.query(SqlTraceInfo) | ||
| if for_update_or_delete: | ||
| return query | ||
|
|
||
| if not self._workspaces_enabled(): | ||
| return query | ||
|
|
||
| workspace = self._get_active_workspace() | ||
| return query.join( | ||
| SqlExperiment, SqlTraceInfo.experiment_id == SqlExperiment.experiment_id | ||
| ).filter(SqlExperiment.workspace == workspace) | ||
|
|
||
| def _logged_model_query(self, session): | ||
| """Return a SqlLoggedModel query scoped to the active workspace when enabled.""" | ||
| query = session.query(SqlLoggedModel) | ||
| if not self._workspaces_enabled(): | ||
| return query | ||
|
|
||
| workspace = self._get_active_workspace() | ||
| # Use a subquery to filter by workspace without adding a join to the main query | ||
| # This avoids issues with duplicate rows when additional joins are added for filtering | ||
| workspace_experiment_ids = ( | ||
| session.query(SqlExperiment.experiment_id) | ||
| .filter(SqlExperiment.workspace == workspace) | ||
| .subquery() | ||
| ) | ||
| return query.filter( | ||
| SqlLoggedModel.experiment_id.in_(select(workspace_experiment_ids.c.experiment_id)) | ||
| ) | ||
|
|
||
| def _get_logged_model_record( | ||
| self, session: sqlalchemy.orm.Session, model_id: str, include_deleted: bool = False | ||
| ) -> SqlLoggedModel: | ||
| query = self._logged_model_query(session).filter(SqlLoggedModel.model_id == model_id) | ||
| if not include_deleted: | ||
| query = query.filter(SqlLoggedModel.lifecycle_stage != LifecycleStage.DELETED) | ||
|
|
||
| logged_model = query.one_or_none() | ||
| if logged_model is None: | ||
| self._raise_model_not_found(model_id) | ||
| return logged_model | ||
|
|
||
| def _dataset_query(self, session): | ||
| """Return a SqlEvaluationDataset query scoped to the active workspace when enabled.""" | ||
|
|
||
| query = session.query(SqlEvaluationDataset) | ||
| if not self._workspaces_enabled(): | ||
| return query | ||
| return query.filter(SqlEvaluationDataset.workspace == self._get_active_workspace()) | ||
|
|
||
| def _dataset_workspace_filter(self): | ||
| if not self._workspaces_enabled(): | ||
| return sql.true() | ||
| return SqlEvaluationDataset.workspace == self._get_active_workspace() | ||
|
|
||
| def _get_dataset_record( | ||
| self, session: sqlalchemy.orm.Session, dataset_id: str | ||
| ) -> SqlEvaluationDataset: | ||
| dataset = ( | ||
| self._dataset_query(session) | ||
| .filter(SqlEvaluationDataset.dataset_id == dataset_id) | ||
| .one_or_none() | ||
| ) | ||
| if dataset is None: | ||
| raise MlflowException( | ||
| f"Evaluation dataset with id '{dataset_id}' not found", | ||
| RESOURCE_DOES_NOT_EXIST, | ||
| ) | ||
| return dataset | ||
|
|
There was a problem hiding this comment.
🛠️ Refactor suggestion | 🟠 Major
Workspace-scoped query helpers centralize and simplify isolation
The new helpers:
_runs_query— scopesSqlRunby joiningSqlExperimentonexperiment_idand filtering by workspace when enabled._trace_query— returns either a rawSqlTraceInfoquery (for_update_or_delete=True) or a workspace-scoped join withSqlExperiment._logged_model_query— uses a workspace-scoped experiment-id subquery to avoid duplicate rows when additional joins are added._dataset_query/_dataset_workspace_filter/_get_dataset_record— encapsulate all evaluation dataset workspace filtering.
These abstractions are well-designed and are reused appropriately throughout the file, making future workspace changes easier.
🤖 Prompt for AI Agents
In mlflow/store/tracking/sqlalchemy_store.py around lines 4915 to 5004, the new
workspace-scoped query helper methods (_runs_query, _trace_query,
_logged_model_query, _dataset_query, _dataset_workspace_filter,
_get_dataset_record) are implemented and reviewed okay; no functional code
changes required—ensure all call sites use these helpers consistently and
add/update unit/integration tests covering workspace-enabled and
workspace-disabled paths to validate behavior.
| assert "scorer_permissions" in tables | ||
| assert "registered_model_permissions" in tables | ||
| assert version[0] == "8606fa83a998" | ||
| assert version[0] == "2ed73881770d" |
There was a problem hiding this comment.
🧩 Analysis chain
Verify completeness of upgrade assertions and Alembic version.
Two concerns:
-
The test verifies
scorer_permissionsis added during upgrade (line 121), but doesn't verifyworkspace_permissionsis added, even though the other two tests expect this table to exist. If the upgrade process addsworkspace_permissions, it should be asserted here as well. -
The Alembic version changed from "8606fa83a998" to "2ed73881770d" - please verify this matches the actual migration version that adds workspace support.
Apply this diff to add the missing assertion:
assert "alembic_version_auth" in tables
assert "users" in tables
assert "experiment_permissions" in tables
assert "scorer_permissions" in tables
assert "registered_model_permissions" in tables
+ assert "workspace_permissions" in tables
assert version[0] == "2ed73881770d"
assert user == ("testuser", 1)Run the following script to verify the Alembic version corresponds to the workspace migration:
🏁 Script executed:
#!/bin/bash
# Description: Verify the Alembic version in the migration files
# Search for the migration file with version 2ed73881770d
fd -e py . | xargs rg -l "2ed73881770d"
# Show the migration file content to verify it includes workspace_permissions
fd -e py . | xargs rg -A 5 -B 5 "2ed73881770d"Length of output: 2373
Let me examine the actual migration file to verify what it creates:
[/run_scripts]
🏁 Script executed:
#!/bin/bash
# Display the full migration file to see what it creates
cat mlflow/server/auth/db/migrations/versions/2ed73881770d_workspace_permissions.pyLength of output: 3744
Now let me verify what the test actually does - specifically whether it upgrades all the way to version 2ed73881770d:
[/run_scripts]
🏁 Script executed:
#!/bin/bash
# Show the test_upgrade_from_legacy_database function to see what migrations it applies
rg -A 70 "def test_upgrade_from_legacy_database" tests/server/auth/db/test_cli.pyLength of output: 2390
Let me verify that scorer_permissions is created in an earlier migration to confirm the pattern:
[/run_scripts]
🏁 Script executed:
#!/bin/bash
# Find the migration that creates scorer_permissions
fd -e py . mlflow/server/auth/db/migrations/versions/ | xargs rg -l "scorer_permissions" | head -5Length of output: 80
[/run_scripts]
🏁 Script executed:
#!/bin/bash
# Verify the scorer_permissions migration creates the table
cat mlflow/server/auth/db/migrations/versions/0965eb92f5f0_add_scorer_permissions.pyLength of output: 996
Add missing assertion for workspace_permissions table.
The test upgrades to Alembic version 2ed73881770d, which creates both scorer_permissions (in migration 0965eb92f5f0) and workspace_permissions (in migration 2ed73881770d). The test currently asserts scorer_permissions exists but omits the assertion for workspace_permissions. For consistency and completeness, add the missing assertion:
assert "alembic_version_auth" in tables
assert "users" in tables
assert "experiment_permissions" in tables
assert "scorer_permissions" in tables
assert "registered_model_permissions" in tables
+ assert "workspace_permissions" in tables
assert version[0] == "2ed73881770d"
assert user == ("testuser", 1)The Alembic version "2ed73881770d" is correct.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| assert "scorer_permissions" in tables | |
| assert "registered_model_permissions" in tables | |
| assert version[0] == "8606fa83a998" | |
| assert version[0] == "2ed73881770d" | |
| assert "scorer_permissions" in tables | |
| assert "registered_model_permissions" in tables | |
| assert "workspace_permissions" in tables | |
| assert version[0] == "2ed73881770d" |
🤖 Prompt for AI Agents
In tests/server/auth/db/test_cli.py around lines 121 to 123, the test asserts
presence of "scorer_permissions" but omits asserting the newly created
"workspace_permissions" table for Alembic version 2ed73881770d; add an assertion
that "workspace_permissions" is in the tables collection (placed alongside the
existing scorer_permissions assertion) so the test verifies both permissions
tables exist while leaving the existing version[0] equality check unchanged.
| def _build_endpoint_case_id(handler_name: str, path: str) -> str: | ||
| """ | ||
| Generate a stable pytest param id by combining the handler name with a sanitized | ||
| version of the route path (non-word characters converted to underscores). | ||
| """ | ||
| sanitized_path = re.sub(r"[\\W]+", "_", path.strip("/")) | ||
| return f"{handler_name}__{sanitized_path or 'root'}" | ||
|
|
There was a problem hiding this comment.
Regex in _build_endpoint_case_id does not match its documented behavior
The docstring says “non-word characters converted to underscores”, but the pattern r"[\\W]+" only replaces backslashes and the literal character 'W', not all non-word characters. This is at odds with the intent and could yield odd-looking or less-stable test IDs.
You likely meant to use \W inside the character class:
- sanitized_path = re.sub(r"[\\W]+", "_", path.strip("/"))
+ sanitized_path = re.sub(r"[\W]+", "_", path.strip("/"))While this only affects test IDs (not behavior), aligning the regex with the stated intent will avoid confusion and ensure consistent param IDs across paths.
🤖 Prompt for AI Agents
In tests/server/test_handlers.py around lines 338 to 345, the regex r"[\\W]+" in
_build_endpoint_case_id incorrectly matches backslash and 'W' instead of
non-word characters; replace it with a pattern that actually matches non-word
characters (e.g. use r"\W+" in re.sub) so that all non-word characters are
converted to underscores and the generated pytest param ids match the docstring
intent.
Signed-off-by: mprahl <mprahl@users.noreply.github.com>
d6b41b1 to
40b0e90
Compare
Summary by CodeRabbit
Release Notes
New Features
Configuration
--enable-workspacesserver flag andMLFLOW_ENABLE_WORKSPACESenvironment variableDocumentation
✏️ Tip: You can customize this high-level summary in your review settings.